Smoothing the Evolution of the Spectral Parameters in Speech Coders

Author

  • Mohammad R. Zad-Issa
Abstract

A new generation of speech coders must achieve two goals: efficient use of bandwidth and high speech quality. The objective of this thesis is to improve the modelling of the speech signal within the constraints of a low bit rate coder. Many speech coding algorithms use Linear Prediction (LP) coefficients to describe the power spectrum of speech. These parameters are obtained for blocks of input samples using standard linear prediction analysis. Changes in the speech power spectrum result in the evolution of the LP parameters. However, conventional linear prediction analysis has shortcomings that contribute to the frame-to-frame variation of the LP parameters. These undesired variations affect the performance of parameter coding and the perceptual quality of the synthesized signal. For voiced speech, efficient coding of the excitation pitch pulses relies on the similarity of successive pitch waveforms. The performance of this coding stage is also jeopardized by variations in the LP parameters. The goal of this thesis is to modify traditional linear prediction analysis in such a way that the fluctuations of the LP coefficients are reduced while the pitch pulse shape evolves slowly. These modifications can lead to an increase in coding efficiency and/or an improvement in speech quality. Two different methods have been developed for this purpose. In the first approach, we derive the LP parameters such that the glottal excitation model matches a target waveform as closely as possible. The target contains slowly evolving pulses representing voiced speech excitation. Simulation results indicate that the target matching method increases the pitch prediction gain, which is a measure of the similarity of successive pitch pulses. The frame-to-frame variation of the LP coefficients is also lower than with conventional linear prediction analysis.
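The pitch prediction gain mentioned above can be illustrated with a minimal sketch. It assumes a one-tap pitch predictor, e[n] = r[n] - b·r[n-T], applied to an excitation (LP residual) sequence, with the lag T searched over a caller-supplied range. The function name, the lag range, and the one-tap form are assumptions made for illustration; the abstract does not state which predictor the thesis uses.

```python
import math


def pitch_prediction_gain(residual, min_lag, max_lag):
    """Return (best_lag, gain_dB) for a one-tap pitch predictor.

    Hypothetical helper: for each candidate lag T, the optimal tap is
    b = <cur, past> / <past, past>, and the gain is the ratio of the
    signal energy to the prediction-error energy, in dB.
    """
    best_lag, best_gain = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        cur = residual[lag:]      # samples being predicted
        past = residual[:-lag]    # samples one lag earlier
        cross = sum(c * p for c, p in zip(cur, past))
        past_energy = sum(p * p for p in past)
        cur_energy = sum(c * c for c in cur)
        if past_energy == 0.0 or cur_energy == 0.0:
            continue
        b = cross / past_energy
        # Error energy at the optimal tap: cur_energy - cross^2 / past_energy.
        err_energy = cur_energy - b * cross
        # Floor the error so that perfectly periodic input does not divide by zero.
        err_energy = max(err_energy, 1e-12 * cur_energy)
        gain = 10.0 * math.log10(cur_energy / err_energy)
        if gain > best_gain:
            best_lag, best_gain = lag, gain
    return best_lag, best_gain


# Usage: a pulse train with period 40 samples and identical pulse shapes.
pulse = [1.0, 0.5, -0.3]
excitation = [0.0] * 200
for start in range(0, 200, 40):
    for i, v in enumerate(pulse):
        excitation[start + i] = v
print(pitch_prediction_gain(excitation, 20, 60))
```

In this sketch, successive pulses that differ in shape or amplitude leave more energy in the prediction error and so lower the gain, which is why the gain serves as a similarity measure.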
In the second method, we enforce smoothness on the evolution of the LP parameters by directly including their variation in the LP error function. A novel scheme to dynamically control the contribution of this additional term is also proposed. Experiments indicate that this method can considerably reduce the fluctuation of the LP parameters while maintaining the overall prediction gain of the LP filter.
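The idea of adding a variation term to the LP error function can be sketched as follows. This is an illustration under stated assumptions, not the thesis's formulation: the variation is taken to be the squared Euclidean distance between the current LP coefficients `a` and the previous frame's coefficients `a_prev`, the weight `lam` is fixed rather than dynamically controlled, and the autocorrelation method is used. Minimising the prediction error energy plus `lam * ||a - a_prev||^2` then gives the linear system (R + lam·I) a = r + lam·a_prev. All function names are hypothetical.

```python
def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of one frame (no windowing, for brevity)."""
    n = len(x)
    return [sum(x[i] * x[i - k] for i in range(k, n)) for k in range(max_lag + 1)]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x


def smoothed_lp(frame, a_prev, lam):
    """LP coefficients minimising error energy + lam * ||a - a_prev||^2.

    Assumed penalty form; lam = 0 reduces to conventional LP analysis
    (autocorrelation method). lam is in the same units as frame energy.
    """
    p = len(a_prev)
    r = autocorr(frame, p)
    # Toeplitz autocorrelation matrix with lam added on the diagonal.
    A = [[r[abs(i - j)] + (lam if i == j else 0.0) for j in range(p)]
         for i in range(p)]
    b = [r[i + 1] + lam * a_prev[i] for i in range(p)]
    return solve(A, b)


# Usage: a larger lam pulls the solution towards the previous frame's coefficients.
frame = [float((7 * n) % 11 - 5) for n in range(160)]
a_prev = [0.5, -0.2]
for lam in (0.0, 100.0, 1e6):
    print(lam, smoothed_lp(frame, a_prev, lam))
```

The fixed `lam` here only stands in for the dynamic control scheme, which the abstract mentions but does not detail.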


Similar resources

Using Context-based Statistical Models to Promote the Quality of Voice Conversion Systems

This article examines methods for optimizing the performance of GMM-based voice conversion systems, in which the GMM method is introduced as the baseline for improving voice conversion performance. In current methods, because a single conversion function is used to convert all speech units and statistical averaging leads to spectral smoothing, we will observe quality ...


Very low rate speech coding using temporal decomposition and waveform interpolation

In very low rate coding the aim is to accurately represent speech characteristics as efficiently as possible. High coding gains for the spectral features can be achieved through the use of temporal decomposition. Waveform interpolation coders accurately represent the excitation using characteristic waveforms (CWs) extracted at a constant rate. In this paper, the two approaches are combined into...


Ultra Low Bit-Rate Coders

In this chapter, we present the definition and principles of ultra-low bit-rate coders. The emphasis is on the fact that this class of coders typically consists of ‘vocoders’, which are ‘parametric’ coders that are essentially linear-prediction (LP) based. This is in contrast to the ‘waveform’ coders, which operate at higher bit-rates. Among the various frameworks employed...


The Function of Pitch Range Variations in Samples of Emotional Expressions in Persian

This study investigates the interface between emotion and intonation patterns (more specifically, duration and pitch amplitude of speech). To this end, the acoustic properties of spectral parameters related to speech prosody are investigated. The results of acoustic and statistical analysis show that the mean level and range of F0 in the contours vary strongly as a function of the degree o...


Correlation between Auditory Spectral Resolution and Speech Perception in Children with Cochlear Implants

Background: Variability in speech performance is a major concern for children with cochlear implants (CIs). Spectral resolution is an important acoustic component in speech perception. Considerable variability and limitations of spectral resolution in children with CIs may lead to individual differences in speech performance. The aim of this study was to assess the correlation between auditory ...




Publication date: 1998